Extracting Paraphrases of Japanese Sentence Ending Part From Web and Mobile News Articles

Authors: not recorded
Abstract

In this research, we extract paraphrases from two kinds of Japanese news articles: Web news articles, which are long and intended for display on personal computer screens, and mobile news articles, which are short, compact, and intended for the small screens of mobile terminals. We collected both kinds for more than two years and aligned them first at the article level and then at the sentence level. As a result, we obtained more than 88,000 pairs of aligned sentences. From this aligned corpus, we then extract paraphrases of the final part of sentences. The paraphrases we target are the sentence-final nouns of mobile article sentences and their counterpart expressions in Web article sentences. We extract character strings and word sequences as paraphrase candidates based on branching factor, frequency, and string length. For the 100 most frequently used action nouns, the precision is 90% for the highest-ranked candidate and 83% to 59% for each of the top three candidates.
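
The abstract names three signals (branching factor, frequency, string length) but does not say how they are combined. The following is a minimal sketch only, assuming that candidates are character suffixes of the Web sentence endings, that the branching factor is the number of distinct characters seen immediately to the left of a suffix, and that the three signals are simply multiplied. The function names, the scoring formula, and the three toy sentences are hypothetical illustrations and are not taken from the paper.

```python
from collections import defaultdict


def suffix_stats(endings, max_len=6):
    """Return {suffix: (frequency, left_branching_factor)} for all suffixes.

    The left branching factor is the number of distinct characters that
    occur immediately before the suffix (assumed definition).
    """
    freq = defaultdict(int)
    left_chars = defaultdict(set)
    for s in endings:
        for n in range(1, min(max_len, len(s)) + 1):
            suffix = s[-n:]
            freq[suffix] += 1
            # Use "^" as a marker when the suffix covers the whole string.
            left = s[-n - 1] if len(s) > n else "^"
            left_chars[suffix].add(left)
    return {suf: (freq[suf], len(left_chars[suf])) for suf in freq}


def rank_candidates(endings, max_len=6, top=3):
    """Rank suffixes by a hypothetical score: frequency * branching * length."""
    stats = suffix_stats(endings, max_len)

    def score(item):
        suffix, (count, branching) = item
        # The suffix itself breaks ties so the ordering is deterministic.
        return (count * branching * len(suffix), suffix)

    ranked = sorted(stats.items(), key=score, reverse=True)
    return [suffix for suffix, _ in ranked[:top]]


# Made-up Web sentences assumed to be aligned with mobile sentences
# that end in the action noun "辞任" (resignation).
web_endings = ["首相が辞任した", "社長が辞任した", "大臣は辞任した"]
stats = suffix_stats(web_endings)
best = rank_candidates(web_endings)
print(best[0])
```

In this toy input the suffix shared by all three sentences is preceded by two different particles, so its branching factor is higher than that of its shorter suffixes, which is the intuition behind using branching as a boundary cue.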

Similar resources

Extracting Paraphrases of Japanese Action Word of Sentence Ending Part from Web and Mobile News Articles

In this research, we extract paraphrases from two kinds of Japanese news articles: Web news articles, which are long and intended for display on personal computer screens, and mobile news articles, which are short, compact, and intended for the small screens of mobile terminals. We collected both kinds for more than two years and aligned them first at the article level and then at the sentence level. As a result, we obtained more than 88,000 pairs of a...

Terminal Device Oriented Comparable Corpora and its Alignment- Towards Extracting Paraphrasing Patterns

Many terminal devices for mobile environments, such as mobile phones, have small, low-resolution screens compared with the large, high-resolution screens of personal computers. As a result, Web pages for ordinary personal computers and for mobile phones, written in the same language, are developed separately even though they describe the same topic or content. In this research, we collected We...

Automatic Paraphrase Acquisition from News Articles

Paraphrases play an important role in the variety and complexity of natural language documents. However, they add to the difficulty of natural language processing. Here we describe a procedure for obtaining paraphrases from news articles. A set of paraphrases can be useful for various kinds of applications. Articles derived from different newspapers can contain paraphrases if they report the sam...

A Persian Text Summarization System Based on Linguistic Features and Regression

Considering the vast amount of existing written information and the shortage of time, optimal summarization of books, articles, news reports, etc. on the Web is a major concern of researchers. In this paper, we propose a new approach for Persian single-document summarization based on several linguistic features of text. In our approach, after extracting the linguistic features for each sentence,...

Paraphrasing Headlines by Machine Translation

In this paper we investigate the automatic collection, generation and evaluation of sentential paraphrases. Valuable sources of paraphrases are news article headlines; they tend to describe the same event in various different ways, and can easily be obtained from the web. We describe a method for generating paraphrases by using a large aligned monolingual corpus of news headlines acquired autom...

Publication date: 2004